Dynamic and Approximate Pattern Matching in 2D

Authors

  • Raphaël Clifford
  • Allyx Fontaine
  • Tatiana A. Starikovskaya
  • Hjalte Wedel Vildhøj
Abstract

We consider dynamic and online variants of 2D pattern matching between an m × m pattern and an n × n text. All the algorithms we give are randomised and give correct outputs with at least constant probability.

  • For dynamic 2D exact matching where updates change individual symbols in the text, we show updates can be performed in O(log n) time and queries in O(log m) time.
  • We then consider a model where an update is a new 2D pattern and a query is a location in the text. For this setting we show that Hamming distance queries can be answered in O(log m + H) time, where H is the relevant Hamming distance.
  • Extending this work to allow approximation, we give an efficient algorithm which returns a (1 + ε) approximation of the Hamming distance at a given location in O(ε^{-2} log m log log n) time.

Finally, we consider a different setting inspired by previous work on locality sensitive hashing (LSH). Given a threshold k and after building the 2D text index and receiving a 2D query pattern, we must output a location where the Hamming distance is at most (1 + ε)k as long as there exists a location where the Hamming distance is at most k.

  • For our LSH inspired 2D indexing problem, the text can be preprocessed in O(n log n) time into a data structure of size O(n) with query time O(nm).
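To make the query in the abstract concrete, the following is a minimal brute-force sketch of "the Hamming distance at a given location": the number of positions at which an m × m pattern differs from the m × m block of the text whose top-left corner is that location. It is a reference definition only, taking O(m²) time per query, and is not the randomised data structure the paper describes. The names `hamming_at` and `locations_within` are hypothetical, and representing the text and pattern as lists of equal-length strings is an assumption made for illustration.

```python
from typing import List, Sequence, Tuple


def hamming_at(text: Sequence[str], pattern: Sequence[str], row: int, col: int) -> int:
    """Count mismatches between `pattern` and the block of `text` at (row, col).

    Naive O(m^2) baseline (hypothetical helper, not the paper's method).
    """
    m = len(pattern)
    n = len(text)
    # The pattern must fit entirely inside the text at this location.
    if row < 0 or col < 0 or row + m > n or col + m > len(text[0]):
        raise ValueError("pattern does not fit at this location")
    return sum(
        1
        for i in range(m)
        for j in range(m)
        if text[row + i][col + j] != pattern[i][j]
    )


def locations_within(text: Sequence[str], pattern: Sequence[str], k: int) -> List[Tuple[int, int]]:
    """Return every location whose Hamming distance to `pattern` is at most k.

    Scans all (n - m + 1)^2 locations; with k = 0 this is exact matching.
    """
    m = len(pattern)
    n = len(text)
    return [
        (r, c)
        for r in range(n - m + 1)
        for c in range(n - m + 1)
        if hamming_at(text, pattern, r, c) <= k
    ]


if __name__ == "__main__":
    text = ["abca", "bcab", "cabc", "abca"]
    pattern = ["bc", "ca"]
    print(hamming_at(text, pattern, 0, 1))
    print(locations_within(text, pattern, 0))
```

A brute-force routine like this is mainly useful as a correctness oracle when testing a faster, randomised implementation on small inputs.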

Similar resources

The $\mathcal{E}$-Average Common Submatrix: Approximate Searching in a Restricted Neighborhood

This paper introduces a new (dis)similarity measure for 2D arrays, extending the Average Common Submatrix measure. This is accomplished by: (i) considering the frequency of matching patterns, (ii) restricting the pattern matching to a fixed-size neighborhood, and (iii) computing a distance-based approximate matching. This will achieve better performances with low execution time and larger infor...

Approximate String Matching with Ordered q-Grams

Approximate string matching with k differences is considered. Filtration of the text is a widely adopted technique to reduce the text area processed by dynamic programming. We present sublinear filtration algorithms based on the locations of q-grams in the pattern. Samples of q-grams are drawn from the text at fixed periods, and only if consecutive samples appear in the pattern approximately in...

Approximate string matching as an algebraic computation

Approximate string matching has a long history and employs a wide variety of methods (see e.g. the survey [2]). We consider a variant of approximate matching that compares a fixed pattern string to every substring in the text string by a rational-weighted edit distance (e.g. the indel distance, defined as the number of character insertions and deletions, or the indelsub/Levenshtein distance, wh...

A Comparative Study of Different Longest Common Subsequence Algorithms

The longest common subsequence is a classical problem which is solved by using the dynamic programming approach. The LCS problem has an optimal substructure: the problem can be broken down into smaller, simple "subproblems", which can be broken down into yet simpler subproblems, and so on, until, finally, the solution becomes trivial. The LCS problem also has overlapping subproblems: the soluti...

Adaptive Approximate Record Matching

Typographical data entry errors and incomplete documents produce imperfect records in real-world databases. These errors generate distinct records which belong to the same entity. The aim of Approximate Record Matching is to find multiple records which belong to an entity. In this paper, an algorithm for Approximate Record Matching is proposed that can be adapted automatically with input error...

Publication date: 2016